@RuthTurk RuthTurk commented Sep 13, 2025

📣 Summary

Add a new superuser-only endpoint, /api/v2/user-reports/, to access and filter all user usage data.

💭 Notes

  • Created a new Django app user_reports
  • Used a SQL materialized view along with some Django tables to gather and calculate all the data for the endpoint
  • The Celery task refresh_user_report_snapshots runs every 30 minutes and updates the data in batches
  • The endpoint is only accessible if Stripe is enabled
  • Renamed api tag "Audit logs (superusers)" to "Server logs (superusers)" (Zulip thread)

📖 Description

This PR introduces a new endpoint /api/v2/user-reports that provides detailed information about users, including their personal details, metadata, organization info, subscriptions, usage statistics, and computed balances. It supports flexible nested filtering, allowing queries like:

/api/v2/user-reports/?q=username__icontains:demo
/api/v2/user-reports/?q=service_usage__total_storage_bytes__gte:1
/api/v2/user-reports/?q=service_usage__balances__submission__balance_value__gte:5000
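
As a rough illustration of how such a `q` value could map onto a Django-style filter, here is a minimal sketch. The `parse_q` helper is hypothetical and is not the parser used by this PR; it only assumes the `lookup:value` shape visible in the examples above.

```python
from typing import Any


def parse_q(q: str) -> dict[str, Any]:
    """Split 'field__lookup:value' into a single Django-style filter kwarg.

    Hypothetical helper for illustration only.
    """
    lookup, sep, raw = q.partition(":")
    if not sep or not lookup:
        raise ValueError(f"expected 'lookup:value', got {q!r}")
    # Convert integer-looking values so range lookups compare numbers,
    # and leave everything else as a string.
    try:
        value: Any = int(raw)
    except ValueError:
        value = raw
    return {lookup: value}


print(parse_q("username__icontains:demo"))
print(parse_q("service_usage__total_storage_bytes__gte:1"))
```

In a Django view, a dict like this could be passed on as `queryset.filter(**kwargs)`, provided the lookups are validated against an allow-list first.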

To handle a large user base, the endpoint leverages a PostgreSQL materialized view that aggregates user-related data for fast retrieval. Since some data (like usage and submissions) lives in another database, a periodic Celery task refresh_user_report_snapshots was added to pre-compute and store this data in a new BillingAndUsageSnapshot table. The materialized view references this snapshot for further computations. A BillingAndUsageSnapshotRun table was also added to track each run and safely resume interrupted jobs.
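
The resumable batching described above can be sketched roughly as follows. This is not the task's actual code: `SnapshotRun`, its `last_processed_id` and `status` fields, and the `compute` callback are assumptions standing in for the BillingAndUsageSnapshotRun row and the cross-database usage lookup.

```python
from dataclasses import dataclass


@dataclass
class SnapshotRun:
    """Hypothetical stand-in for a BillingAndUsageSnapshotRun row."""
    last_processed_id: int = 0
    status: str = "in_progress"


def refresh_in_batches(user_ids, run, compute, store, batch_size=2):
    """Process users after run.last_processed_id, checkpointing per batch."""
    pending = sorted(uid for uid in user_ids if uid > run.last_processed_id)
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for uid in batch:
            # In the real task this would be written to the snapshot table.
            store[uid] = compute(uid)
        # Checkpoint after each batch so an interrupted run can resume here.
        run.last_processed_id = batch[-1]
    run.status = "completed"
    return run


store = {}
# Pretend an earlier run was interrupted after user 2.
run = SnapshotRun(last_processed_id=2)
refresh_in_batches(
    [1, 2, 3, 4, 5],
    run,
    compute=lambda uid: {"total_storage_bytes": uid * 10},
    store=store,
)
print(sorted(store), run)
```

Checkpointing once per batch rather than once per user keeps the bookkeeping writes small, at the cost of reprocessing at most one batch after an interruption.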

👀 Preview steps

  1. ℹ️ have a superuser account and multiple regular users in multiple organizations
  2. create projects, make submissions, and buy Stripe subscriptions
  3. check out this branch and run migrations for user_reports
  4. log in as a regular user
  5. 🟢 visit /api/v2/user-reports/ and get a 403 Forbidden error
  6. log in as a superuser
  7. run the Celery task to refresh the data: refresh-user-report-snapshot
  8. 🟢 visit /api/v2/user-reports/ and get a 200 response with all the users and data
  9. verify that all data is correct
  10. make some submissions, then run the Celery task again (or wait for the next scheduled run)
  11. 🟢 verify that the data has been correctly refreshed

@RuthTurk RuthTurk force-pushed the dev-899-optimize-queries-for-subscriptions branch from 507182e to 3f421d1 Compare September 16, 2025 15:22
@rajpatel24 rajpatel24 force-pushed the dev-899-optimize-queries-for-subscriptions branch from 5e61538 to b8bd62d Compare October 10, 2025 16:42
@rajpatel24 rajpatel24 force-pushed the dev-899-optimize-queries-for-subscriptions branch from 944c064 to 214b7ad Compare October 12, 2025 09:06
@noliveleger noliveleger force-pushed the dev-899-optimize-queries-for-subscriptions branch from b644fdf to a9498f1 Compare October 15, 2025 18:21
@noliveleger noliveleger changed the title feat(api): add /api/v2/user-reports/ endpoint for superusers DEV-232 DEV-899 feat(userReports): add /api/v2/user-reports/ endpoint for superusers DEV-232 DEV-899 Oct 15, 2025
@noliveleger noliveleger left a comment

LGTM

@RuthTurk RuthTurk merged commit 5ad4774 into main Oct 16, 2025
13 checks passed
@RuthTurk RuthTurk deleted the dev-899-optimize-queries-for-subscriptions branch October 16, 2025 13:20
rajpatel24 added a commit that referenced this pull request Oct 16, 2025
…ts` endpoint DEV-287 DEV-233 (#6342)

### 📣 Summary
Adds filtering and ordering capabilities to the
`/api/v2/user-reports` API endpoint, enabling superusers to query
large user datasets efficiently by date, usage metrics, and subscription
attributes. Queries are backed by the optimized materialized view, which
is intended to scale to millions of records.

### 📖 Description
This PR introduces the filtering and ordering layer for the
`/api/v2/user-reports` endpoint, built on top of the existing
materialized view (`user_reports_userreportsmv`) that aggregates
user-level billing and usage data.

It also adds targeted indexes in `0003_add_user_reports_mv_indexes.py` on
high-cardinality numeric and timestamp columns so that range queries can
avoid full table scans.

  Examples of usage:
  - Filter by username (case-insensitive, contains):
    `/api/v2/user-reports/?q=username__icontains:raj`
  - Filter by email (exact match):
    `/api/v2/user-reports/?q=email:[email protected]`
  - Filter by total storage bytes (greater than or equal to):
    `/api/v2/user-reports/?q=service_usage__total_storage_bytes__gte:1`
  - Filter by total storage bytes (less than or equal to):
    `/api/v2/user-reports/?q=service_usage__total_storage_bytes__lte:1`
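
For client scripts, URLs like the ones above could be assembled with a small helper. This is a sketch using only the standard library; `user_reports_url` is a hypothetical name, and keeping `:` unescaped is a readability choice rather than a documented requirement of the endpoint.

```python
from urllib.parse import urlencode

BASE_PATH = "/api/v2/user-reports/"


def user_reports_url(lookup: str, value: object) -> str:
    """Build a filter URL of the form ?q=<lookup>:<value>."""
    # safe=":" leaves the lookup/value separator readable; other reserved
    # characters in the value (for example "@" or spaces) are still escaped.
    query = urlencode({"q": f"{lookup}:{value}"}, safe=":")
    return f"{BASE_PATH}?{query}"


print(user_reports_url("username__icontains", "raj"))
print(user_reports_url("service_usage__total_storage_bytes__gte", 1))
```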

---

Part of #6243

---------

Co-authored-by: RuthShryock <[email protected]>
Co-authored-by: Olivier Léger <[email protected]>
Labels

API Changes related to API endpoints Back end
